5 research outputs found

    Parameter Sharing Reinforcement Learning Architecture for Multi Agent Driving Behaviors

    Multi-agent learning provides a potential framework for learning and simulating traffic behaviors. This paper proposes a novel architecture to learn multiple driving behaviors in a traffic scenario. The proposed architecture can learn multiple behaviors independently as well as simultaneously. We take advantage of the homogeneity of agents and learn in a parameter-sharing paradigm. To further speed up the training process, asynchronous updates are employed in the architecture. While learning different behaviors simultaneously, the framework was also able to learn cooperation between the agents without any explicit communication. We applied this framework to learn two important driving behaviors: 1) lane-keeping and 2) overtaking. Results indicate faster convergence and learning of a more generic behavior that is scalable to any number of agents. When compared with existing approaches, our results indicate equal, and in some cases better, performance.
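
    The core of the parameter-sharing paradigm is that all homogeneous agents act through one policy network, so every agent's experience updates the same weights. The snippet below is a minimal illustrative sketch of that idea, not the authors' implementation; the observation dimension, network sizes, and per-agent returns are assumptions.

```python
# Minimal parameter-sharing sketch: one shared policy queried by all agents.
import torch
import torch.nn as nn
from torch.distributions import Categorical

class SharedPolicy(nn.Module):
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.Tanh(),
            nn.Linear(64, n_actions))

    def forward(self, obs):
        return Categorical(logits=self.net(obs))

policy = SharedPolicy(obs_dim=10, n_actions=3)        # one set of weights
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Every agent queries the same network; log-probs from all agents feed a
# single policy-gradient update on the shared parameters.
observations = torch.randn(4, 10)                      # 4 homogeneous agents (dummy)
returns = torch.tensor([1.0, 0.5, 0.8, 1.2])           # per-agent returns (dummy)
dist = policy(observations)
actions = dist.sample()
loss = -(dist.log_prob(actions) * returns).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()
```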

    A Deep Reinforcement Learning Approach for Dynamically Stable Inverse Kinematics of Humanoid Robots

    Real-time calculation of inverse kinematics (IK) with dynamically stable configurations is of high necessity in humanoid robots, as they are highly susceptible to losing balance. This paper proposes a methodology to generate joint-space trajectories of stable configurations for solving inverse kinematics using deep reinforcement learning (RL). Our approach is based on the idea of exploring the entire configuration space of the robot and learning the best possible solutions using Deep Deterministic Policy Gradient (DDPG). The proposed strategy was evaluated on the highly articulated upper body of a humanoid model with 27 degrees of freedom (DoF). The trained model was able to solve inverse kinematics for the end effectors with 90% accuracy while maintaining balance in the double-support phase.
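
    DDPG trains a deterministic actor alongside a Q-critic, which fits this setting: the actor maps a state (current joint angles plus the target end-effector pose) to a joint-space action, and the critic scores it. The sketch below is only an illustration of one DDPG update step under those assumptions, with dummy dimensions and reward and without target networks or a replay buffer; it is not the paper's code.

```python
# One simplified DDPG-style update for joint-space IK (illustrative only).
import torch
import torch.nn as nn

state_dim, action_dim = 27 + 6, 27   # 27 joint angles + 6-D target pose (assumed)

actor = nn.Sequential(nn.Linear(state_dim, 256), nn.ReLU(),
                      nn.Linear(256, action_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
                       nn.Linear(256, 1))
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

# Dummy batch; a real reward would combine end-effector error and a balance term.
s = torch.randn(64, state_dim)
a = torch.randn(64, action_dim)
r = torch.randn(64, 1)
q_target = r                          # bootstrapped target omitted for brevity

# Critic regression toward the target, then deterministic policy gradient.
critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=1)), q_target)
critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
```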

    DiGrad: Multi-Task Reinforcement Learning with Shared Actions

    Most reinforcement learning algorithms are inefficient for learning multiple tasks in complex robotic systems, where different tasks share a set of actions. In such environments, a compound policy may be learnt with shared neural network parameters, which performs multiple tasks concurrently. However, such a compound policy may become biased towards one task, or the gradients from different tasks may negate each other, making the learning unstable and sometimes less data-efficient. In this paper, we propose a new approach for simultaneous training of multiple tasks sharing a set of common actions in continuous action spaces, which we call DiGrad (Differential Policy Gradient). The proposed framework is based on differential policy gradients and can accommodate multi-task learning in a single actor-critic network. We also propose a simple heuristic in the differential policy gradient update to further improve the learning. The proposed architecture was tested on an 8-link planar manipulator and a 27 degrees-of-freedom (DoF) humanoid for learning multi-goal reachability tasks for 3 and 2 end effectors, respectively. We show that our approach supports efficient multi-task learning in complex robotic systems, outperforming related methods in continuous action spaces.
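
    The shape of the architecture can be pictured as one shared actor producing the compound action and one critic per task, with all task objectives back-propagating into the same actor weights. The sketch below shows only that generic layout under assumed dimensions; it does not reproduce DiGrad's actual differential-gradient update or its heuristic.

```python
# Rough sketch: shared actor, per-task critics, combined actor objective.
import torch
import torch.nn as nn

state_dim, action_dim, n_tasks = 20, 8, 2

actor = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU(),
                      nn.Linear(128, action_dim), nn.Tanh())
critics = nn.ModuleList([
    nn.Sequential(nn.Linear(state_dim + action_dim, 128), nn.ReLU(),
                  nn.Linear(128, 1))
    for _ in range(n_tasks)])
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)

s = torch.randn(32, state_dim)
a = actor(s)
# Each task keeps its own critic; the shared actor is updated with the summed
# objective, so gradients from every task flow into the same parameters.
actor_loss = -sum(c(torch.cat([s, a], dim=1)).mean() for c in critics)
actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
```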

    Learning Coordinated Tasks using Reinforcement Learning in Humanoids

    With the advent of artificial intelligence and machine learning, humanoid robots are made to learn a variety of skills which humans possess. One of the fundamental skills which humans use in day-to-day activities is performing tasks with coordination between both hands. In the case of humanoids, learning such skills requires optimal motion planning, which includes avoiding collisions with the surroundings. In this paper, we propose a framework to learn coordinated tasks in cluttered environments based on DiGrad, a multi-task reinforcement learning algorithm for continuous action spaces. Further, we propose an algorithm to smooth the joint-space trajectories obtained by the proposed framework, in order to reduce the noise instilled during training. The proposed framework was tested on a 27 degrees of freedom (DoF) humanoid with an articulated torso for performing a coordinated object-reaching task with both hands in four different environments with varying levels of difficulty. It is observed that the humanoid is able to plan collision-free trajectories in real time. Simulation results also reveal the usefulness of the articulated torso for performing tasks which require coordination between both arms.
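
    As a point of reference for the trajectory-smoothing step, the snippet below applies a plain moving-average filter to a joint-space trajectory. It only illustrates the kind of post-processing the abstract describes; the paper's own smoothing algorithm is not reproduced here, and the window size and trajectory shape are assumptions.

```python
# Generic moving-average smoothing of a (T, n_joints) joint trajectory.
import numpy as np

def smooth_trajectory(traj, window=5):
    """traj: (T, n_joints) array of joint angles; returns a smoothed copy."""
    kernel = np.ones(window) / window
    smoothed = np.empty_like(traj)
    for j in range(traj.shape[1]):
        # mode="same" keeps the trajectory length; end points are averaged
        # over a shorter effective window.
        smoothed[:, j] = np.convolve(traj[:, j], kernel, mode="same")
    return smoothed

# Example: a noisy 27-DoF trajectory of 100 time steps.
noisy = np.cumsum(np.random.randn(100, 27) * 0.01, axis=0)
clean = smooth_trajectory(noisy, window=7)
```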

    Coracoclavicular Ligament Reconstruction Using a Semitendinosus Tendon Graft with Polyester Suture No. 5 (Ethibond) for Type-III Acromioclavicular Dislocation

    BACKGROUND The aim of the study is to review the functional and radiological results of patients after coracoclavicular ligament reconstruction using a semitendinosus tendon graft for type-III acromioclavicular dislocation. MATERIALS AND METHODS Nine patients aged 21 to 50 (mean, 35) years with Rockwood type-III acromioclavicular dislocation underwent coracoclavicular ligament reconstruction with autogenous semitendinosus tendon grafts. Patients were either active in sports or heavy manual workers. Assessments of shoulder function (using the Constant Score), wound size, pain (using a Visual Analogue Scale), and reduction (using radiographs of both acromioclavicular joints) were made. RESULTS The mean follow-up period was 18 (range, 12–24) months; the mean time to return to work or sports was 16 (range, 12–20) weeks. The mean Constant Score was 94 (range, 90–98). The mean donor-site scar size was 4 cm and the mean pain score was 0. No major complication or donor-site morbidity was noted; there was one wound dehiscence. CONCLUSION Coracoclavicular ligament reconstruction using an autogenous semitendinosus tendon graft was safe in physically active patients with type-III acromioclavicular dislocation.